A prediction model based machine learning algorithms with feature selection approaches over imbalanced dataset
Authors
Abstract
Predicting student performance with supervised and unsupervised machine learning algorithms has been the subject of many studies in the educational sector. Most student data are imbalanced: the final classes are not equally represented. Besides the size of the dataset, this imbalance affects a model's prediction accuracy. In this paper, the Synthetic Minority Oversampling Technique (SMOTE) filter is applied to the dataset to find its effect on accuracy. Four feature selection approaches are used to find the most correlated attributes that affect student performance. The SMOTE filter is examined before and after applying feature selection to measure accuracy with three supervised and three unsupervised algorithms used to predict performance. The findings show that the supervised algorithms (LMT, Simple Logistic, Random Forest) achieved high accuracy without feature selection. The accuracies of the unsupervised algorithms (Canopy, EM, Farthest First) were enhanced after applying the filter.
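The oversampling step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. This is not the authors' implementation or any particular toolkit's filter; the function names `smote` and `balance`, the parameter `k`, and the toy "pass"/"fail" data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample minority_label until it matches the largest class size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    largest = max(y.count(label) for label in set(y))
    new_points = smote(minority, largest - len(minority), k=k, seed=seed)
    return list(X) + new_points, list(y) + [minority_label] * len(new_points)


# Hypothetical toy data: 6 "pass" records versus 3 "fail" records.
X = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (0.9, 0.7), (1.3, 1.0),
     (3.0, 3.0), (3.2, 2.8), (2.9, 3.3)]
y = ["pass"] * 6 + ["fail"] * 3

X_bal, y_bal = balance(X, y, "fail", k=2)
print(y_bal.count("pass"), y_bal.count("fail"))  # prints "6 6"
```

Because every synthetic point lies between two existing minority samples, the oversampled class stays inside the region the minority already occupies. In an evaluation like the one the abstract describes, oversampling of this kind would normally be applied to the training data only, so that test accuracy is still measured on real records.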
Similar resources
Sports Result Prediction Based on Machine Learning and Computational Intelligence Approaches: A Survey
In the current world, sports produce considerable statistical information about each player, team, games, and seasons. Traditional sports science believed science to be owned by experts, coaches, team managers, and analyzers. However, sports organizations have recently realized the abundant science available in their data and sought to take advantage of that science through the use of data mini...
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...
Correlation-based Feature Selection for Machine Learning
A central problem in machine learning is identifying a representative set of features from which to construct a classification model for a particular task. This thesis addresses the problem of feature selection for machine learning through a correlation based approach. The central hypothesis is that good feature sets contain features that are highly correlated with the class, yet uncorrelated w...
A Pareto-based Ensemble with Feature and Instance Selection for Learning from Multi-Class Imbalanced Datasets
Imbalanced classification is related to those problems that have an uneven distribution among classes. In addition to the former, when instances are located into the overlapped areas, the correct modeling of the problem becomes harder. Current solutions for both issues are often focused on the binary case study, as multi-class datasets require an additional effort to be addressed. In this resea...
Prostate cancer radiomics: A study on IMRT response prediction based on MR image features and machine learning approaches
Introduction: To develop different radiomic models based on radiomic features and machine learning methods to predict early intensity modulated radiation therapy (IMRT) response. Materials and Methods: Thirty prostate patients were included. All patients underwent pre- and post-IMRT T2 weighted and apparent diffusion coefficient (ADC) magnetic resonance imagi...
Journal
Journal title: Indonesian Journal of Electrical Engineering and Computer Science
Year: 2022
ISSN: 2502-4752, 2502-4760
DOI: https://doi.org/10.11591/ijeecs.v28.i2.pp1105-1116